Weight Optimization for Bimodal Unit-Selection Talking Head Synthesis

Authors

Asterios Toutios

Utpala Musti

Slim Ouni

Vincent Colotte

Abstract

This paper addresses talking head synthesis based on the concatenation of units that comprise both acoustic and visual information. The diphone units used to synthesize a given text string are selected by minimizing a weighted linear combination of four costs that reflect linguistic, acoustic, and visual considerations. We present initial work toward a method for automatically determining the weight applied to each cost, using a series of metrics that quantitatively assess synthesis performance.
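The selection criterion described in the abstract can be illustrated with a minimal sketch. It assumes each candidate unit is already scored on four costs; the cost values, the weights, and the function names `total_cost` and `select_unit` are hypothetical and do not come from the paper. The sketch also scores candidates one at a time, whereas a full unit-selection system would typically search over whole unit sequences.

```python
from typing import Sequence


def total_cost(costs: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted linear combination of the per-unit costs."""
    if len(costs) != len(weights):
        raise ValueError("costs and weights must have the same length")
    return sum(w * c for w, c in zip(weights, costs))


def select_unit(candidates: Sequence[Sequence[float]],
                weights: Sequence[float]) -> int:
    """Return the index of the candidate with the lowest weighted cost."""
    return min(range(len(candidates)),
               key=lambda i: total_cost(candidates[i], weights))


# Hypothetical candidates: each tuple holds four made-up cost values.
candidates = [
    (0.2, 0.5, 0.1, 0.4),
    (0.3, 0.1, 0.2, 0.2),
    (0.1, 0.6, 0.3, 0.1),
]

# Equal weights versus weights that emphasize the first cost.
uniform_choice = select_unit(candidates, (1.0, 1.0, 1.0, 1.0))
skewed_choice = select_unit(candidates, (4.0, 0.5, 0.5, 0.5))
print(uniform_choice, skewed_choice)
```

With these made-up numbers the two weight settings pick different candidates, which is why the choice of weights matters and why tuning them automatically against objective metrics is attractive.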

Similar resources

Realistic facial animation system for interactive services

This paper presents the optimization of parameters of talking head for web-based applications with a talking head, such as Newsreader and E-commerce, in which the realistic talking head initiates a conversation with users. Our talking head system includes two parts: analysis and synthesis. The audiovisual analysis part creates a face model of a recorded human subject, which is composed of a per...

Optimization of an Image-Based Talking Head System

This paper presents an image-based talking head system, which includes two parts: analysis and synthesis. The audiovisual analysis part creates a face model of a recorded human subject, which is composed of a personalized 3D mask as well as a large database of mouth images and their related information. The synthesis part generates natural looking facial animations from phonetic transcripts of ...

Image-based Talking Head: Analysis and Synthesis

In this paper, our image-based talking head system is presented, which includes two parts: analysis and synthesis. In the analysis part, a subject reading a predefined corpus is recorded first. The recorded audio-visual data is analyzed in order to create a database containing a large number of normalized mouth images and their related information. The synthesis part generates natural looking t...

"Mask-bot": A life-size robot head using talking head animation for human-robot communication

In this paper, we introduce our life-size talking head robotic system, “Mask-bot”, developed as a platform to support and accelerate human-robot communication research. The “Mask-bot” hardware consists of a semi-transparent plain mask, a portable LED projector with a fish-eye conversion lens mounted behind the mask, a pan-tilt unit and a mounting base. The hardware is driven by a software anima...

A Framework for Data-driven Video-realistic Audio-visual Speech-synthesis

In this work, we present a framework for generating a video-realistic audio-visual “Talking Head”, which can be integrated in applications as a natural Human-Computer interface where audio only is not an appropriate output channel especially in noisy environments. Our work is based on a 2D-video-frame concatenative visual synthesis and a unit-selection based Text-to-Speech system. In order to ...

Publication date: 2011
